5 research outputs found
Bring Your Own KG: Self-Supervised Program Synthesis for Zero-Shot KGQA
We present BYOKG, a universal question-answering (QA) system that can operate
on any knowledge graph (KG), requires no human-annotated training data, and can
be ready to use within a day -- attributes that are out-of-scope for current
KGQA systems. BYOKG draws inspiration from the remarkable ability of humans to
comprehend information present in an unseen KG through exploration -- starting
at random nodes, inspecting the labels of adjacent nodes and edges, and
combining them with their prior world knowledge. In BYOKG, exploration
leverages an LLM-backed symbolic agent that generates a diverse set of
query-program exemplars, which are then used to ground a retrieval-augmented
reasoning procedure to predict programs for arbitrary questions. BYOKG is
effective over both small- and large-scale graphs, showing dramatic gains in QA
accuracy over a zero-shot baseline of 27.89 and 58.02 F1 on GrailQA and MetaQA,
respectively. On GrailQA, we further show that our unsupervised BYOKG
outperforms a supervised in-context learning method, demonstrating the
effectiveness of exploration. Lastly, we find that performance of BYOKG
reliably improves with continued exploration as well as improvements in the
base LLM, notably outperforming a state-of-the-art fine-tuned model by 7.08 F1
on a sub-sampled zero-shot split of GrailQA.
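The exploration step described in this abstract (start at a random node, then inspect the labels of adjacent nodes and edges) can be pictured as a random walk over the graph. The following is a minimal sketch only, not BYOKG's implementation: the adjacency-dict representation, the name `random_walk`, and the `max_hops` parameter are all assumptions made for illustration.

```python
import random
from typing import Dict, List, Tuple

# Hypothetical KG representation: node -> list of (edge_label, neighbor).
Graph = Dict[str, List[Tuple[str, str]]]


def random_walk(graph: Graph, max_hops: int, rng: random.Random) -> List[Tuple[str, str, str]]:
    """Walk from a random start node, recording (node, edge_label, neighbor) triples."""
    node = rng.choice(sorted(graph))  # sorted for reproducibility under a fixed seed
    path: List[Tuple[str, str, str]] = []
    for _ in range(max_hops):
        edges = graph.get(node, [])
        if not edges:  # dead end: no outgoing edges to inspect
            break
        label, neighbor = rng.choice(edges)
        path.append((node, label, neighbor))
        node = neighbor
    return path


# Toy graph, invented for this example.
toy_graph: Graph = {
    "Inception": [("directed_by", "Christopher Nolan"), ("released_in", "2010")],
    "Christopher Nolan": [("directed", "Inception")],
    "2010": [],
}
walk = random_walk(toy_graph, max_hops=3, rng=random.Random(0))
```

Each recorded triple exposes the node and edge labels that an agent could combine into a query program; how BYOKG turns such walks into query-program exemplars is not specified in the abstract.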
Surveys without Questions: A Reinforcement Learning Approach
The 'old world' instrument, the survey, remains a tool of choice for firms to
obtain ratings of satisfaction and experience that customers realize while
interacting online with firms. While avenues for surveys have evolved from
emails and links to pop-ups while browsing, the deficiencies persist. These
include reliance on the ratings of very few respondents to infer about all
customers' online interactions; failure to capture a customer's interactions
over time, since the rating is a one-time snapshot; and an inability to tie
customers' ratings back to specific interactions, because the ratings provided
relate to all interactions. To overcome these deficiencies, we extract proxy ratings from
clickstream data, typically collected for every customer's online interactions,
by developing an approach based on Reinforcement Learning (RL). We introduce a
new way to interpret values generated by the value function of RL, as proxy
ratings. Our approach does not need any survey data for training. Yet, on
validation against actual survey data, proxy ratings yield reasonable
performance results. Additionally, we offer a new way to draw insights from
values of the value function, which allows associating specific interactions with
their proxy ratings. We introduce two new metrics to represent ratings: one
customer-level and the other aggregate-level, for click actions across
customers. Both are defined around the proportion of all pairwise, successive
actions that show an increase in proxy ratings. This intuitive customer-level
metric enables gauging the dynamics of ratings over time and is a better
predictor of purchase than customer ratings from surveys. The aggregate-level
metric allows pinpointing actions that help or hurt experience. In sum, proxy
ratings computed unobtrusively from clickstream, for every action, for each
customer, and for every session can offer an interpretable and more insightful
alternative to surveys.
Comment: The Thirty-Third AAAI Conference on Artificial Intelligence (AAAI-19
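The two metrics described in this abstract are defined around the proportion of successive actions that show an increase in proxy rating. A minimal sketch of one possible reading follows. It assumes that "pairwise, successive" means consecutive action pairs within a session, and that the aggregate-level metric credits the action a step arrives at; the function names and the toy data are hypothetical and do not come from the paper.

```python
from collections import defaultdict
from typing import Dict, Sequence, Tuple


def customer_level_metric(proxy_ratings: Sequence[float]) -> float:
    """Share of consecutive action pairs in which the proxy rating increases."""
    steps = list(zip(proxy_ratings, proxy_ratings[1:]))
    if not steps:
        raise ValueError("need at least two actions to form a pair")
    return sum(later > earlier for earlier, later in steps) / len(steps)


def aggregate_level_metric(
    sessions: Sequence[Sequence[Tuple[str, float]]]
) -> Dict[str, float]:
    """Per action, share of steps arriving at that action that raised the rating.

    Each session is a sequence of (action, proxy_rating) pairs; results pool
    all customers' sessions (an assumed reading of "aggregate-level").
    """
    increases: Dict[str, int] = defaultdict(int)
    totals: Dict[str, int] = defaultdict(int)
    for session in sessions:
        for (_, prev_rating), (action, rating) in zip(session, session[1:]):
            totals[action] += 1
            increases[action] += rating > prev_rating
    return {action: increases[action] / totals[action] for action in totals}


# Toy data, invented for this example.
customer_score = customer_level_metric([0.1, 0.3, 0.2, 0.5])
action_scores = aggregate_level_metric([
    [("home", 0.1), ("search", 0.3), ("cart", 0.2)],
    [("home", 0.2), ("cart", 0.4)],
])
```

In the toy data, two of the three consecutive steps raise the rating, so the customer-level score is 2/3. An action with a low aggregate score would be a candidate for one that hurts experience, under the assumptions above.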
MedFilter: Improving Extraction of Task-relevant Utterances through Integration of Discourse Structure and Ontological Knowledge
Information extraction from conversational data is particularly challenging
because the task-centric nature of conversation allows humans to communicate
implicit information effectively, while such information remains difficult for
machines to recover. The challenges may differ between utterances depending on the role of
the speaker within the conversation, especially when relevant expertise is
distributed asymmetrically across roles. Further, the challenges may also
increase over the conversation as more shared context is built up through
information communicated implicitly earlier in the dialogue. In this paper, we
propose the novel modeling approach MedFilter, which addresses these insights
in order to increase performance at identifying and categorizing task-relevant
utterances, and in so doing, positively impacts performance at a downstream
information extraction task. We evaluate this approach on a corpus of nearly
7,000 doctor-patient conversations where MedFilter is used to identify
medically relevant contributions to the discussion (achieving a 10% improvement
over SOTA baselines in terms of area under the PR curve). Identifying
task-relevant utterances benefits downstream medical processing, achieving
improvements of 15%, 105%, and 23% respectively for the extraction of symptoms,
medications, and complaints.
Comment: Accepted as Long Paper to EMNLP 202